
    Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs

    Graph neural networks (GNNs), as the de-facto model class for representation learning on graphs, are built upon the multi-layer perceptron (MLP) architecture with additional message passing layers that allow features to flow across nodes. While conventional wisdom commonly attributes the success of GNNs to their advanced expressivity, we conjecture that this is not the main cause of GNNs' superiority in node-level prediction tasks. This paper pinpoints the major source of GNNs' performance gain as their intrinsic generalization capability, by introducing an intermediate model class dubbed P(ropagational)MLP, which is identical to a standard MLP during training but adopts the GNN architecture at test time. Intriguingly, we observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient to train. This finding sheds new light on the learning behavior of GNNs and can be used as an analytic tool for dissecting various GNN-related research problems. As an initial step in analyzing the inherent generalizability of GNNs, we show that the essential difference between MLP and PMLP in the infinite-width limit lies in the NTK feature map in the post-training stage. Moreover, by examining their extrapolation behavior, we find that although many GNNs and their PMLP counterparts cannot extrapolate non-linear functions for extremely out-of-distribution samples, they have greater potential to generalize to test samples near the training data range, a natural advantage of GNN architectures.
    Comment: Accepted to ICLR 2023. Codes in https://github.com/chr26195/PML
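    The train-as-MLP, test-as-GNN idea can be sketched in a few lines. This is a minimal illustration with toy data and mean-aggregation message passing, not the authors' code; the weights, graph, and two-layer shape are all assumptions for illustration.

```python
# Minimal PMLP sketch: the same weights are used as a plain MLP during
# training and with message passing added at test time (toy data).
import numpy as np

rng = np.random.default_rng(0)
n, d, h, c = 6, 4, 8, 2                      # nodes, features, hidden, classes
X = rng.normal(size=(n, d))                  # node features
A = np.eye(n)                                # adjacency with self-loops
A[0, 1] = A[1, 0] = A[2, 3] = A[3, 2] = 1.0
A_hat = A / A.sum(axis=1, keepdims=True)     # row-normalized propagation matrix

W1 = rng.normal(size=(d, h)) * 0.1           # weights: trained as a plain MLP
W2 = rng.normal(size=(h, c)) * 0.1           # (training loop omitted)

def mlp_forward(X):
    """Standard MLP forward pass: what is used during training."""
    return np.maximum(X @ W1, 0) @ W2

def pmlp_forward(X):
    """PMLP: identical weights, but features propagate across edges in testing."""
    H = np.maximum(A_hat @ X @ W1, 0)        # message passing before each layer
    return A_hat @ H @ W2

print(mlp_forward(X).shape, pmlp_forward(X).shape)
```

    The point of the construction is that any performance difference between the two forward passes is attributable to the propagation matrix alone, since the learned parameters are shared.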

    Lyman-α polarization from cosmological ionization fronts: I. Radiative transfer simulations

    In this paper, we present the formalism for simulating Lyman-α emission and polarization around reionization (z = 8) from a plane-parallel ionization front. We accomplish this by using a Monte Carlo method to simulate the production of a Lyman-α photon, its propagation through an ionization front, and the eventual escape of the photon. This paper focuses on the relation of the input parameters (ionization front speed U, blackbody temperature T_bb, and neutral hydrogen density n_HI) to the intensity I and polarized intensity P seen by a distant observer. The resulting intensity ranges from 3.18×10^-14 erg/cm^2/s/sr to 1.96×10^-9 erg/cm^2/s/sr, and the polarized intensity ranges from 5.73×10^-17 erg/cm^2/s/sr to 5.31×10^-12 erg/cm^2/s/sr. We found that higher T_bb, higher U, and higher n_HI all contribute to higher intensity and polarized intensity, with the strongest dependence on the hydrogen density. The dependence on the viewing angle of the front is also explored. We present tests to support the validity of the model, which makes it suitable for further use in a following paper, where we will calculate the intensity and polarized-intensity power spectra on a full reionization simulation.
    Comment: 29 pages, 13 figures, to be submitted to JCA
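    The Monte Carlo machinery can be illustrated with a deliberately stripped-down example: photons random-walking through a uniform slab of given optical depth, tallying the escape fraction. This is only the generic method; the paper's ionization-front geometry, frequency redistribution, and polarization tracking are not modeled here, and all parameters are illustrative.

```python
# Toy Monte Carlo radiative transfer: photons take exponentially distributed
# free paths through a slab of total optical depth tau, scatter isotropically,
# and we count the fraction escaping through the far side.
import numpy as np

def escape_fraction(tau, n_photons=20000, seed=1):
    rng = np.random.default_rng(seed)
    escaped = 0
    for _ in range(n_photons):
        z, mu = 0.0, 1.0                       # depth (optical-depth units), direction cosine
        while True:
            z += mu * (-np.log(rng.random()))  # sample an exponential free path
            if z >= tau:                       # escaped through the far side
                escaped += 1
                break
            if z < 0.0:                        # scattered back out the near side
                break
            mu = rng.uniform(-1.0, 1.0)        # isotropic re-emission direction
    return escaped / n_photons

print(escape_fraction(0.1))   # optically thin slab: most photons get through
```

    A full radiative transfer code adds frequency-dependent cross sections and, for polarization, tracks the Stokes parameters through each scattering; the random-walk skeleton stays the same.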

    Advective Diffusion Transformers for Topological Generalization in Graph Learning

    Graph diffusion equations are intimately related to graph neural networks (GNNs) and have recently attracted attention as a principled framework for analyzing GNN dynamics, formalizing their expressive power, and justifying architectural choices. One key open question in graph learning is the generalization capability of GNNs. A major limitation of current approaches hinges on the assumption that the graph topologies in the training and test sets come from the same distribution. In this paper, we take steps toward understanding the generalization of GNNs by exploring how graph diffusion equations extrapolate and generalize in the presence of varying graph topologies. We first show deficiencies in the generalization capability of existing models built upon local diffusion on graphs, stemming from their exponential sensitivity to topology variation. Our subsequent analysis reveals the promise of non-local diffusion, which advocates feature propagation over fully-connected latent graphs, under a specific data-generating assumption. Building on these findings, we propose a novel graph encoder backbone, the Advective Diffusion Transformer (ADiT), inspired by advective graph diffusion equations, which admit a closed-form solution backed by theoretical guarantees of the desired generalization under topological distribution shifts. The new model, functioning as a versatile graph Transformer, demonstrates superior performance across a wide range of graph learning tasks.
    Comment: 39 pages
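    The local diffusion dynamics the abstract refers to is the graph heat equation dH/dt = -L H, whose discretized form is a repeated neighbor-averaging update. A minimal sketch on a toy four-node path graph (Euler integration; not the ADiT model itself):

```python
# Local graph diffusion: explicit Euler integration of dH/dt = -L H,
# where L is the combinatorial graph Laplacian of a 4-node path graph.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)    # path graph adjacency
D = np.diag(A.sum(axis=1))
L = D - A                                    # graph Laplacian

H = np.array([[1.0], [0.0], [0.0], [0.0]])   # all "heat" starts on node 0
dt = 0.1
for _ in range(50):                          # Euler step: H <- H - dt * L @ H
    H = H - dt * (L @ H)

print(H.ravel())                             # mass has spread toward uniform
```

    Because the solution is exp(-Lt) applied to the initial state, a perturbation of the topology perturbs L inside a matrix exponential, which is one intuition for the exponential sensitivity to topology variation discussed above.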

    DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

    Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures needed to learn desired instance representations. To this end, we introduce an energy-constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states that progressively incorporate other instances' information through their interactions. The diffusion process is constrained by descent criteria w.r.t. a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength between arbitrary instance pairs, which gives rise to a new class of neural encoders, dubbed DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitively large instance counts, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance on various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction.
    Comment: Accepted by the International Conference on Learning Representations (ICLR 2023)
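    The core update — pairwise diffusion strengths derived from feature similarity, followed by a weighted averaging of states — can be sketched as follows. This is a hypothetical simplification in the spirit of the simple instantiation, not the official DIFFormer layer; the similarity kernel and step size tau are assumptions.

```python
# One diffusion step over a batch of instance states: compute row-stochastic
# pairwise strengths S from feature similarity, then mix each state with the
# similarity-weighted average of all states.
import numpy as np

def diffusion_update(Z, tau=0.5):
    """z_i <- (1 - tau) * z_i + tau * sum_j S_ij z_j, with S row-stochastic."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # unit-normalize states
    S = np.exp(Zn @ Zn.T)                               # similarity-based strengths
    S = S / S.sum(axis=1, keepdims=True)                # row-normalize to sum to 1
    return (1 - tau) * Z + tau * (S @ Z)

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))          # 5 instances, 3-dim evolutionary states
Z1 = diffusion_update(Z)
print(Z1.shape)
```

    Stacking such updates lets every instance's representation progressively absorb information from all others; the linear-complexity version in the paper avoids materializing the full n×n matrix S, which this sketch does not.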

    CD8(+) T Cells Involved in Metabolic Inflammation in Visceral Adipose Tissue and Liver of Transgenic Pigs

    Anti-inflammatory therapies have the potential to become an effective treatment for obesity-related diseases. However, the large gap between the human and rodent immune systems limits drug discovery. This work aims to construct a transgenic pig model with a higher risk of metabolic diseases and to outline the immune responses at the early stage of metaflammation using a transcriptomic strategy. We used CRISPR/Cas9 techniques to knock in three humanized disease-risk genes: GIPR(dn), hIAPP, and PNPLA3(I148M). The transgenic modifications increased the risk of metabolic disorders. Triple-transgenic pigs subjected to a short-term diet intervention showed early symptoms of type 2 diabetes, including glucose intolerance, pancreatic lipid infiltration, islet hypertrophy, hepatic lobular inflammation, and adipose tissue inflammation. Molecular pathways related to CD8(+) T cell function were significantly activated in the liver and visceral adipose samples from triple-transgenic pigs, including antigen processing and presentation, T-cell receptor signaling, co-stimulation, cytotoxicity, and cytokine and chemokine secretion. The similar pro-inflammatory signaling in liver and visceral adipose tissue indicates a potential immune crosstalk between the two tissues. Moreover, genes functionally related to liver antioxidant activity, mitochondrial function, and extracellular matrix showed distinct expression between the two groups, indicating metabolic stress in the transgenic pigs' liver samples. We confirmed that the triple-transgenic pigs correspond closely to human metabolic diseases, especially in the scope of inflammatory signaling at the early stage of metaflammation. Taken together, this study provides a valuable large animal model for the clinical study of metaflammation and metabolic diseases.
    Peer reviewed

    A Low-Power Area-Efficient Precision Scalable Multiplier with an Input Vector Systolic Structure

    In this paper, a small-area, low-power 64-bit integer multiplier is presented, which is suitable for portable devices or wireless applications. To reduce area cost and power consumption, an input vector systolic (IVS) structure is proposed based on four 16-bit radix-8 Booth multipliers, together with a data input scheme that reduces the number of signal transitions. This structure is similar to a systolic array in the matrix multiply units of a Convolutional Neural Network (CNN), but it reduces the number of processing elements by 3/4 relative to the vector systolic accelerator in the referenced design. Comparison results show that the IVS multiplier reduces area by at least 61.9% and power by at least 45.18% relative to its counterparts. To increase hardware resource utilization, a Transverse Carry Array (TCA) structure for Partial Product Accumulation (PPA) is designed by replacing the 32-bit adders with 3/17-bit adders in the 16-bit multipliers. Experimental results show that this optimization yields at least a 6.32% reduction in power consumption and a 13.65% reduction in area cost compared to the standard 16-bit radix-8 Booth multiplier. Finally, the precision scalability of the proposed IVS multiplier is discussed. Benefiting from the modular design, the IVS multiplier can be configured to support sixteen different kinds of multiplications in 16-bit steps: [16b, 32b, 48b, 64b] × [16b, 32b, 48b, 64b].
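    The tiling idea behind the precision-scalable design — assembling a wide product from 16-bit sub-multiplications, as the IVS structure does by reusing its four 16-bit Booth multipliers — can be modeled in software. This is only an arithmetic illustration of the shift-and-accumulate composition, not the hardware architecture; the function name and interface are hypothetical.

```python
# Compose an (a_bits x b_bits) product from chunk-bit partial products:
# each pair of 16-bit slices is multiplied and shifted into position,
# mirroring how a precision-scalable multiplier reuses narrow multipliers.
def wide_mul(a, b, a_bits, b_bits, chunk=16):
    mask = (1 << chunk) - 1
    result = 0
    for i in range(a_bits // chunk):
        for j in range(b_bits // chunk):
            pa = (a >> (i * chunk)) & mask            # i-th slice of operand a
            pb = (b >> (j * chunk)) & mask            # j-th slice of operand b
            result += (pa * pb) << ((i + j) * chunk)  # shift and accumulate
    return result

# Any [16b, 32b, 48b, 64b] x [16b, 32b, 48b, 64b] combination works in
# 16-bit steps, e.g. the full 64b x 64b case:
a, b = 0x1234_5678_9ABC_DEF0, 0x0FED_CBA9_8765_4321
assert wide_mul(a, b, 64, 64) == a * b
```

    In hardware, the inner products would be produced by the radix-8 Booth multipliers and the shifted accumulation by the partial-product adder tree; the software model only checks that the decomposition is arithmetically exact.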